Ocala
SCOP: Evaluating the Comprehension Process of Large Language Models from a Cognitive View
Xiao, Yongjie, Liang, Hongru, Qin, Peixin, Zhang, Yao, Lei, Wenqiang
Despite the great potential of large language models (LLMs) in machine comprehension, it remains unsettling to rely on them fully in real-world scenarios. This is probably because there is no rational explanation for whether the comprehension process of LLMs is aligned with that of experts. In this paper, we propose SCOP to carefully examine how LLMs perform during the comprehension process from a cognitive view. Specifically, it is equipped with a systematic definition of five requisite skills during the comprehension process, a strict framework for constructing test data for these skills, and a detailed analysis of advanced open-source and closed-source LLMs using the test data. With SCOP, we find that it is still challenging for LLMs to perform an expert-level comprehension process. Even so, we notice that LLMs share some similarities with experts, e.g., performing better at comprehending local information than global information. Further analysis reveals that LLMs can be somewhat unreliable: they might reach correct answers through flawed comprehension processes. Based on SCOP, we suggest that one direction for improving LLMs is to focus more on the comprehension process, ensuring all comprehension skills are thoroughly developed during training.
- North America > United States > Florida > Marion County > Ocala (0.14)
- North America > United States > South Carolina > Greenville County > Wade Hampton (0.14)
- North America > United States > Florida > Miami-Dade County > Tamiami (0.14)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
Cho, Jaemin, Mahata, Debanjan, Irsoy, Ozan, He, Yujie, Bansal, Mohit
Document visual question answering (DocVQA) pipelines that answer questions from documents have broad applications. Existing methods focus on handling single-page documents with multi-modal language models (MLMs), or rely on text-based retrieval-augmented generation (RAG) that uses text extraction tools such as optical character recognition (OCR). However, there are difficulties in applying these methods in real-world scenarios: (a) questions often require information across different pages or documents, where MLMs cannot handle many long documents; (b) documents often have important information in visual elements such as figures, but text extraction tools ignore them. We introduce M3DocRAG, a novel multi-modal RAG framework that flexibly accommodates various document contexts (closed-domain and open-domain), question hops (single-hop and multi-hop), and evidence modalities (text, chart, figure, etc.). M3DocRAG finds relevant documents and answers questions using a multi-modal retriever and an MLM, so that it can efficiently handle single or many documents while preserving visual information. Since previous DocVQA datasets ask questions in the context of a specific document, we also present M3DocVQA, a new benchmark for evaluating open-domain DocVQA over 3,000+ PDF documents with 40,000+ pages. On three benchmarks (M3DocVQA/MMLongBench-Doc/MP-DocVQA), empirical results show that M3DocRAG with ColPali and Qwen2-VL 7B outperforms many strong baselines and achieves state-of-the-art performance on MP-DocVQA. We provide comprehensive analyses of different indexing, MLMs, and retrieval models. Lastly, we qualitatively show that M3DocRAG can successfully handle various scenarios, such as when relevant information exists across multiple pages and when answer evidence only exists in images.
- Europe > Spain > Galicia > Madrid (0.06)
- Europe > Spain > Balearic Islands > Mallorca (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Leisure & Entertainment > Sports > Horse Racing (1.00)
- Leisure & Entertainment > Games > Computer Games (0.68)
MiMiC: Minimally Modified Counterfactuals in the Representation Space
Singh, Shashwat, Ravfogel, Shauli, Herzig, Jonathan, Aharoni, Roee, Cotterell, Ryan, Kumaraguru, Ponnurangam
Language models often exhibit undesirable behaviors, such as gender bias or toxic language. Interventions in the representation space have been shown to be effective in mitigating such issues by altering the LM behavior. We first show that two prominent intervention techniques, Linear Erasure and Steering Vectors, do not enable a high degree of control and are limited in expressivity. We then propose a novel intervention methodology for generating expressive counterfactuals in the representation space, aiming to make representations of a source class (e.g., "toxic") resemble those of a target class (e.g., "non-toxic"). This approach, generalizing previous linear intervention techniques, utilizes a closed-form solution for the Earth Mover's problem under Gaussian assumptions and provides theoretical guarantees on the representation space's geometric organization. We further build on this technique and derive a nonlinear intervention that enables controlled generation. We demonstrate the effectiveness of the proposed approaches in mitigating bias in multiclass classification and in reducing the generation of toxic language, outperforming strong baselines.
- North America > United States > Florida > Marion County > Ocala (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Does Outrage Signal Cyber Attacks? Predicting "Bad Behavior" from Sentiment in Online Content
Hollingshead, Kristy (Florida Institute for Human and Machine Cognition) | Dorr, Bonnie J. (Florida Institute for Human and Machine Cognition) | Dalton, Adam (Florida Institute for Human and Machine Cognition) | Barton, Meg (Leidos, Inc.)
We demonstrate that it is possible to leverage big data in the form of tweets and linked webpages to find expressions of sentiment that signal "bad behavior" such as cyber attacks. We hypothesize that expressions of "outrage" (high intensity, negative affect sentiment) against an organization in public data may be predictive of cyber attacks for two reasons: 1) threat actors may be motivated to launch an attack based on anger/discontent, and 2) outrage associated with an organization or industry may increase the likelihood of that organization or industry being victimized by threat actors (i.e., as a form of "vigilante justice"). We measure sentiment in online content and determine trends in public emotion and their correlation to trends in cyber attacks, as reported in Hackmageddon. We demonstrate that dimensions of sentiment, as afforded by our use of the Circumplex model of emotion, do yield correlations to reported cyber attacks, but that these correlations differ depending on the domain of the data. Thus, the use of this technique requires careful analysis for optimal application.
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- North America > United States > Florida > Marion County > Ocala (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Netherlands > South Holland > Leiden (0.04)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (1.00)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Data Science > Data Mining (0.89)
Cyberdyne's HAL Exoskeleton Helps Patients Walk Again in First Treatments at U.S. Facility
Danny Bal was riding his brand new motorcycle to work from his home in Ocala, Florida, two years ago when the driver of an oncoming car fell asleep and ploughed into Bal's electric-blue bike. After the accident, which crushed three of Bal's thoracic vertebrae and shredded a spinal nerve, Bal adjusted to life in a wheelchair. He added a motorized lift to his beloved F-250 truck, explored local trails with a hand-powered bike, and joined a therapeutic horseback riding program. Now, one of Bal's daughters is about to get married, and 57-year-old Bal wants to walk in her ceremony. So on a recent Friday morning in December at Brooks Rehabilitation in Jacksonville, Florida, Bal was back on his feet, taking slow but steady steps as his granddaughter cheered from the sidelines.
- North America > United States > Florida > Marion County > Ocala (0.25)
- North America > United States > Florida > Duval County > Jacksonville (0.25)
- North America > United States > Florida > Volusia County > Daytona Beach (0.05)
- Europe > Spain (0.05)
- Health & Medicine > Health Care Providers & Services (0.98)
- Health & Medicine > Therapeutic Area > Neurology (0.95)
An Ostrich-Like Robot Pushes the Limits of Legged Locomotion
What looks like a tiny mechanical ostrich chasing after a car is actually a significant leap forward for robot-kind. The clever and simple two-legged robot, known as the Planar Elliptical Runner, was developed at the Institute for Human and Machine Cognition in Ocala, Florida, to explore how mechanical design can be used to enable sophisticated legged locomotion. A video produced by the researchers shows the robot being tested in a number of situations, including on a treadmill and running behind and alongside a car with a helping hand from an engineer. In contrast to many other legged robots, this one doesn't use sensors and a computer to help balance itself. Instead, its mechanical design provides dynamic stability as it runs.
- North America > United States > Florida > Marion County > Ocala (0.26)
- North America > United States > Oregon (0.06)
- North America > United States > Michigan (0.06)
Government regulators are looking into fatal Tesla crash involving Autopilot
Tesla announced today that the National Highway Traffic Safety Administration has opened an investigation into a recent fatal crash of a Model S with the company's Autopilot feature activated. The accident took place on May 7th in a small West Florida town called Williston. The Florida Highway Patrol is also conducting its own investigation of the accident, according to a public affairs officer there. The same officer reported that, since the fatal accident in May, Tesla has sent engineers down to Ocala, Florida, to assist investigators in accessing data they needed to evaluate the causes of the crash. Tesla offered an account of the event in a blog post titled "A Tragic Loss" that went up today, detailing the crash, an "extremely rare circumstance," which occurred on a divided highway.
- North America > United States > Florida > Marion County > Ocala (0.26)
- North America > United States > Ohio > Stark County > Canton (0.06)
- Transportation > Ground > Road (1.00)
- Government > Regional Government > North America Government > United States Government (0.37)
Cognitive Orthoses: Toward Human-Centered AI
Ford, Kenneth M. (Florida Institute for Human and Machine Cognition (IHMC)) | Hayes, Patrick J. (Florida Institute for Human and Machine Cognition (IHMC)) | Glymour, Clark (Florida Institute for Human and Machine Cognition (IHMC)) | Allen, James (Florida Institute for Human and Machine Cognition (IHMC))
This introduction focuses on how human-centered computing (HCC) is changing the way that people think about information technology. The AI perspective views this HCC framework as embodying a systems view, in which human thought and action are linked and equally important in terms of analysis, design, and evaluation. This emerging technology provides a new research outlook for AI applications, with new research goals and agendas.
- North America > United States > Florida > Escambia County > Pensacola (0.05)
- North America > United States > Ohio (0.05)
- North America > United States > Florida > Marion County > Ocala (0.05)
- Europe > France (0.05)
- Government (0.70)
- Health & Medicine > Therapeutic Area (0.30)
Speech Adaptation in Extended Ambient Intelligence Environments
Dorr, Bonnie J. (Institute for Human and Machine Cognition) | Galescu, Lucian (Institute for Human and Machine Cognition) | Perera, Ian (Institute for Human and Machine Cognition) | Hollingshead-Seitz, Kristy (Institute for Human and Machine Cognition) | Atkinson, David (Institute for Human and Machine Cognition) | Clark, Micah (Institute for Human and Machine Cognition) | Clancey, William (Institute for Human and Machine Cognition) | Wilks, Yorick (Institute for Human and Machine Cognition) | Fosler-Lussier, Eric (Ohio State University)
This Blue Sky presentation focuses on a major shift toward a notion of “ambient intelligence” that transcends general applications targeted at the general population. The focus is on highly personalized agents that accommodate individual differences and changes over time. This notion of Extended Ambient Intelligence (EAI) concerns adaptation to a person’s preferences and experiences, as well as changing capabilities, most notably in an environment where conversational engagement is central. An important step in moving this research forward is the accommodation of different degrees of cognitive capability (including speech processing) that may vary over time for a given user—whether through improvement or through deterioration. We suggest that the application of divergence detection to speech patterns may enable adaptation to a speaker’s increasing or decreasing level of speech impairment over time. Taking an adaptive approach toward technology development in this arena may be a first step toward empowering those with special needs so that they may live with a high quality of life. It also represents an important step toward a notion of ambient intelligence that is personalized beyond what can be achieved by mass-produced, one-size-fits-all software currently in use on mobile devices.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > United States > New York > Monroe County > Rochester (0.04)
Companion-Based Ambient Robust Intelligence (CARING)
Dorr, Bonnie (IHMC) | Galescu, Lucian (IHMC) | Golob, Edward (Tulane University) | Venable, K. Brent (Tulane University / IHMC) | Wilks, Yorick (IHMC)
We present a Companion-based Ambient Robust INtelliGence (CARING) system for communication with, and support of, clients with Traumatic Brain Injury (TBI) or Amyotrophic Lateral Sclerosis (ALS). A central component of this system is an artificial companion, combined with a range of elements for ambient intelligence. The companion acts as a personalized intermediary for multi-party communication between the client, the environment (e.g., a Smart Home), caregivers, and health professionals. CARING is based on tightly coupled systems drawing from natural language processing, speech recognition and adaptation, deep language understanding, and constraint-based knowledge representation and reasoning. A major innovation of the system is its ability to adapt and accommodate different interfaces associated with different client capabilities and needs. The system will use, as a proxy, different interaction requirements of clients (e.g., Brain-Computer Interfaces) at different stages of ALS progression and with different types of TBI impairments. Ultimately, this technology is expected to improve the quality of life for clients through conversation with a computer.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
- North America > United States > Florida > Marion County > Ocala (0.05)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.57)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.37)